Exceeding classical capacity limit in quantum optical channel 
The amount of information transmissible through a communications channel is determined by the noise characteristics of the channel and by the quantities of available transmission resources. In classical information theory, the amount of transmissible information can be at most doubled when the transmission resource (e.g. the code length, the bandwidth, the signal power) is doubled for fixed noise characteristics. In quantum information theory, however, the amount of information transmitted can more than double. We present a proof-of-principle demonstration of this super-additivity of the classical capacity of a quantum channel by using the ternary symmetric states of a single photon, and by event selection from a weak coherent light source. We also show how the super-additive coding gain, even at a small code length, can boost the communication performance of conventional coding techniques.

PACS numbers: 03.67.Hk, 03.65.Ta, 42.50.-p 



In any transmission of signals at the quantum level, such as long-haul optical communication where the signals at the receiving end are weak coherent pulses, ambiguity among signals may be more a matter of non-commutativity of quantum states, i.e. $\rho_0\rho_1 \neq \rho_1\rho_0$, than of any classical noise. Such states can never be distinguished perfectly, even in principle. This imposes an inevitable error in signal detection even in an ideal communications system. Only recently was communication theory extended into the quantum domain to include this aspect of ambiguity, and the expressions for the channel capacity were finally obtained [1]. Classical communication theory [2] describes the special case of signals prepared in commuting density matrices.

For reliable transmission in the presence of noise, redundancy must be introduced in representing messages by letters, such as {0, 1}, so that errors can be corrected at the receiving side. The capacity is associated with the functional meaning of this channel coding. Messages of $k$ [bit] are encoded into block sequences of the given letters of length $n$ ($> k$). The $n - k$ [bit] of redundancy allows one to correct errors at the receiving side. For a channel with a capacity $C$ [bit/letter], it is possible [2], at any rate $R = k/n < C$, to reproduce $k$-bit messages with an error probability as small as desired by appropriate encoding and decoding in the limit $n \to \infty$.

In extending the theory of capacity into the quantum domain, the primary concern is the decoding of codewords made of non-commuting density matrices of letters. This is a nontrivial problem of quantum measurement. In fact, the optimal decoding essentially uses a process of entangling the letter states constituting the codewords prior to the measurement, to enhance the distinguishability of signals. Such a process is nothing but a quantum computation on codeword states. This is a new aspect, not found in conventional coding techniques, and it leads to a larger capacity. A significant consequence of this so-called quantum collective decoding is that the capacity can more than double when the code length is doubled. In classical information theory, by contrast, the capacity can at most double. This feature, the super-additive quantum coding gain [?], will be an important design rule for communications at the quantum level.

The theory of capacity, however, generally gives no guidance on how to construct codes that approach the capacity. The practical problem is then to find good codes for small blocks. Although several coding schemes have been proposed to exhibit the super-additive coding gain [?], little attention has been paid to this topic so far, and no experimental work has been reported yet. In fact, putting these theoretical predictions into practice has been considered a formidable task with present technologies. In this letter, we experimentally demonstrate the super-additive coding gain by designing a coding circuit for a quantum channel consisting of the ternary symmetric states in a two-state system (qubit) of a single photon.

For binary non-orthogonal pure states, the most basic signals, the super-additive coding gain is predicted [5] for the minimum length, $n = 2$. The amount of gain, however, is too small to be observed experimentally, namely $5.2 \times 10^{-?}$ [bit] as the net increase of retrievable information per letter over the classical limit. For $n = 3$, a net gain of 0.009 [bit] is predicted; however, this requires more than ten steps of quantum gating with high precision, which is hard to do. Therefore we consider the letter state set that shows the largest coding gain with the minimum code length, $n = 2$, among the known codes [?].

FIG. 1: Geometrical representation of the codeword (dotted arrows) and decoding (solid arrows) state vectors in a real three-dimensional space.

Let us consider the set of ternary letters {0, 1, 2} conveyed by the symmetric states of a qubit system, $\rho_x = |\psi_x\rangle\langle\psi_x|$ with $|\psi_0\rangle = |0\rangle$, $|\psi_1\rangle = -\frac{1}{2}|0\rangle - \frac{\sqrt{3}}{2}|1\rangle$, and $|\psi_2\rangle = -\frac{1}{2}|0\rangle + \frac{\sqrt{3}}{2}|1\rangle$. Here $\{|0\rangle, |1\rangle\}$ is the orthonormal basis set. We assume that these states arrive at the receiver's hand through a noiseless transmission line. If the letter states were prepared in commuting density matrices, they could be distinguished perfectly, and $\log_2 3$ [bit] of information (the maximum Shannon entropy of the set {0, 1, 2}) could be faithfully retrieved per letter, meaning that the capacity would be $\log_2 3$ [bit/letter]. However, the states $\rho_x$ here are non-commuting, and distinguishing them always involves a finite error. In fact, the average error probability can never be lower than 1/3 when they are used with equal prior probabilities [8].
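As a quick numerical check of the statements above, the following Python sketch builds the three symmetric states as real two-component vectors and evaluates one particular measurement, the POVM with elements $(2/3)|\psi_y\rangle\langle\psi_y|$. This choice of measurement and the helper names are assumptions made for illustration, not taken from the text; under equal priors it gives an average error of 1/3, which matches the bound quoted above.

```python
import math

# Ternary symmetric (trine) states as real amplitudes on (|0>, |1>), as defined in the text.
PSI = [
    (1.0, 0.0),
    (-0.5, -math.sqrt(3) / 2),
    (-0.5, math.sqrt(3) / 2),
]


def inner(u, v):
    """Inner product of two real 2-component state vectors."""
    return u[0] * v[0] + u[1] * v[1]


def povm_sum():
    """Sum over y of (2/3)|psi_y><psi_y| as a 2x2 matrix (should be the identity)."""
    return [[sum(2 / 3 * p[i] * p[j] for p in PSI) for j in range(2)] for i in range(2)]


def average_error():
    """Average error of the illustrative POVM for equal priors 1/3.

    P(y|x) = (2/3) |<psi_y|psi_x>|^2, so each success term is P(x|x) = 2/3.
    """
    success = sum((1 / 3) * (2 / 3) * inner(p, p) ** 2 for p in PSI)
    return 1.0 - success


if __name__ == "__main__":
    print("overlaps:", inner(PSI[0], PSI[1]), inner(PSI[1], PSI[2]), inner(PSI[2], PSI[0]))
    print("POVM sum:", povm_sum())
    print("average error:", average_error())
```

Every pair of distinct states has overlap -1/2, which is what makes the set symmetric and the three density matrices mutually non-commuting.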

The capacity is mathematically given [2] in terms of the mutual information $I(X:Y)$, which is defined from the input variable $X = \{x\}$, the output variable $Y = \{y\}$, the prior distribution $\{P(x)\}$ of $X$, and the conditional probability $\{P(y|x)\}$ of $Y$ for given $X$. For a given channel model $[P(y|x)]$, the capacity is defined by $C = \max_{\{P(x)\}} I(X:Y)$. In the quantum context, on the other hand, only the input variable $X$ and the corresponding set of quantum states $\{\rho_x\}$ at the receiver's hand are given. The output variable $Y$ is to be sought through the best quantum measurement, i.e. a POVM $\{\Pi_y\}$. The channel matrix elements are now given by $P(y|x) = \mathrm{Tr}(\Pi_y \rho_x)$, and one is to find the optimized quantity [9]

$$C_1 = \max_{\{P(x)\}} \max_{\{\Pi_y\}} I(X:Y). \qquad (1)$$



For the ternary set $\{|\psi_x\rangle\}$, $C_1$ was evaluated as 0.6454 [bit/letter], which is attained by using only two of the three letters, say $\{|\psi_0\rangle, |\psi_1\rangle\}$, with equal probability 1/2 and by applying the binary measurement to form a binary symmetric channel [10]. The quantity $C_1$ is, however, not the ultimate capacity allowed by quantum mechanics. In fact, $C_1$ specifies the classical capacity limit when the given initial channel is used with classical channel coding. It is this quantity that limits the performance of all conventional communications systems.
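The value $C_1 = 0.6454$ [bit/letter] can be reproduced with a few lines of Python. The sketch below assumes, beyond what the text states, that the binary measurement is the minimum-error (Helstrom) measurement for two equiprobable pure states with overlap $|\langle\psi_0|\psi_1\rangle| = 1/2$; the function names are illustrative.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a binary variable with probabilities (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def helstrom_error(overlap):
    """Minimum error probability for two equiprobable pure states with |<a|b>| = overlap."""
    return 0.5 * (1.0 - math.sqrt(1.0 - overlap ** 2))


def bsc_capacity(crossover):
    """Capacity in bits per use of a binary symmetric channel."""
    return 1.0 - binary_entropy(crossover)


if __name__ == "__main__":
    p_err = helstrom_error(0.5)  # two of the three symmetric states
    print(f"crossover probability = {p_err:.4f}")
    print(f"C_1 = {bsc_capacity(p_err):.4f} bit/letter")
```

Under these assumptions the crossover probability is about 0.067, and the capacity of the resulting binary symmetric channel agrees with the quoted 0.6454 bit per letter to four decimal places.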

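Now let us consider a quantum channel coding of length two. There are nine possible sequences in length-two coding of three letters. Peres and Wootters showed [?] that $I(X^2:Y^2) = 1.3690$ [bit] of information can be retrieved in principle, which is greater than twice the classical limit, $2C_1 = 1.2908$ [bit]. This can be achieved in the following way: only the three sequences $|\Psi_{xx}\rangle = |\psi_x\rangle \otimes |\psi_x\rangle$ ($x = 0, 1, 2$) are used as the codewords, with equal probability 1/3, and they are decoded by the measurement represented by the elements $\Pi_{yy} = |\Pi_{yy}\rangle\langle\Pi_{yy}|$ ($y = 0, 1, 2$), which compose the orthonormal basis spanning $\{|\Psi_{xx}\rangle\}$, that is,

$$|\Psi_{00}\rangle = c\,|\Pi_{00}\rangle - \frac{s}{\sqrt{2}}\,|\Pi_{11}\rangle - \frac{s}{\sqrt{2}}\,|\Pi_{22}\rangle, \qquad \text{(2a)}$$
$$|\Psi_{11}\rangle = -\frac{s}{\sqrt{2}}\,|\Pi_{00}\rangle + c\,|\Pi_{11}\rangle - \frac{s}{\sqrt{2}}\,|\Pi_{22}\rangle, \qquad \text{(2b)}$$
$$|\Psi_{22}\rangle = -\frac{s}{\sqrt{2}}\,|\Pi_{00}\rangle - \frac{s}{\sqrt{2}}\,|\Pi_{11}\rangle + c\,|\Pi_{22}\rangle, \qquad \text{(2c)}$$

where $c = \cos\frac{\gamma}{2} = (\sqrt{2} + 1)/\sqrt{6}$ and $s = \sin\frac{\gamma}{2} = (\sqrt{2} - 1)/\sqrt{6}$ ($\gamma = 19.47^\circ$). Fig. 1 shows a geometrical representation of Eq. (2). The super-additive coding gain is $I(X^2:Y^2)/2 - C_1 = 0.0391$ [bit/letter].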
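The numbers in this paragraph follow directly from $c$ and $s$. As a hedged sketch, the Python snippet below takes the channel matrix to have $c^2$ on the diagonal and $s^2/2$ off the diagonal, which is what $|\langle\Pi_{yy}|\Psi_{xx}\rangle|^2$ gives for the expansion in Eq. (2), and uses the closed form of the mutual information for a symmetric channel with uniform priors. The variable names and the rounded constant used for $C_1$ are choices made here for illustration.

```python
import math

# Coefficients quoted in the text.
c = (math.sqrt(2) + 1) / math.sqrt(6)
s = (math.sqrt(2) - 1) / math.sqrt(6)

p_diag = c ** 2      # probability of decoding the codeword correctly
p_off = s ** 2 / 2   # probability of each of the two wrong outcomes


def symmetric_channel_information(p_diag, p_off, n_outputs=3):
    """I(X:Y) in bits for a symmetric channel with uniform priors.

    For such a channel the output is uniform, so I = log2(n) - H(row).
    """
    row = [p_diag] + [p_off] * (n_outputs - 1)
    row_entropy = -sum(p * math.log2(p) for p in row if p > 0)
    return math.log2(n_outputs) - row_entropy


C1 = 0.6454  # classical limit per letter, value quoted in the text
info_two_letters = symmetric_channel_information(p_diag, p_off)
gain_per_letter = info_two_letters / 2 - C1

if __name__ == "__main__":
    print(f"c^2 = {p_diag:.4f}, s^2/2 = {p_off:.4f}")
    print(f"I(X^2:Y^2) = {info_two_letters:.4f} bit")
    print(f"gain per letter = {gain_per_letter:.4f} bit")
```

The three probabilities in each row sum to one because $c^2 + s^2 = 1$, and the resulting mutual information is close to 1.369 bit, in line with the value attributed to Peres and Wootters.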

To demonstrate this gain, we must be able to entangle two letter states at the receiver's hand prior to a measurement. Unfortunately, the quantum gating operations demonstrated to date are not precise enough to observe the small quantum coding gain. Therefore our method for a proof-of-principle demonstration is based on the use of two physically different kinds of qubits of a single photon. The first and second letters of a codeword are drawn from the ternary letter state sets made of a polarization qubit and a location qubit, respectively. Entangling the polarization and location degrees of freedom of a photon can then be performed by linear optical components with very high accuracy. The polarization qubit consists of the horizontal $|H\rangle$ and vertical $|V\rangle$ polarization states of a single photon. The location qubit for the second letter is realized by guiding the polarization qubit into two different paths A and B through a polarizing beamsplitter (PBS), which reflects the vertical polarization and transmits the horizontal polarization. Thus, the length-two coding space is spanned by the two orthonormal basis sets [?],

$$|00\rangle = |0\rangle_P \otimes |0\rangle_L = |H\rangle_A \otimes |\mathrm{vacuum}\rangle_B, \qquad \text{(3a)}$$
$$|01\rangle = |0\rangle_P \otimes |1\rangle_L = |\mathrm{vacuum}\rangle_A \otimes |H\rangle_B, \qquad \text{(3b)}$$
$$|10\rangle = |1\rangle_P \otimes |0\rangle_L = |V\rangle_A \otimes |\mathrm{vacuum}\rangle_B, \qquad \text{(3c)}$$
$$|11\rangle = |1\rangle_P \otimes |1\rangle_L = |\mathrm{vacuum}\rangle_A \otimes |V\rangle_B. \qquad \text{(3d)}$$

The codeword states $\{|\Psi_{xx}\rangle = |\psi_x\rangle_P \otimes |\psi_x\rangle_L\}$ are represented in this product space. Thus the increase in resources in our coding format is due to doubling the spatial resource, which is analogous to doubling the transmission bandwidth, as opposed to doubling the number of polarized photons. We then want to observe a more than twofold increase in the amount of information transmitted.
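To make the encoding concrete, the following Python sketch lists the four amplitudes of each codeword in the basis of Eq. (3), ordered as $|00\rangle, |01\rangle, |10\rangle, |11\rangle$. It assumes real amplitudes for the ternary symmetric states defined earlier; the names `TRINE`, `MODES`, `codeword`, and `overlap` are hypothetical.

```python
import math

# Ternary symmetric states on (|0>, |1>), as defined earlier in the text.
TRINE = [
    (1.0, 0.0),
    (-0.5, -math.sqrt(3) / 2),
    (-0.5, math.sqrt(3) / 2),
]

# Physical meaning of |00>, |01>, |10>, |11> according to Eq. (3).
MODES = ("H in path A", "H in path B", "V in path A", "V in path B")


def codeword(x):
    """Amplitudes of |psi_x>_P (x) |psi_x>_L, with the polarization index first."""
    pol = loc = TRINE[x]
    return [pol[i] * loc[j] for i in range(2) for j in range(2)]


def overlap(u, v):
    """Inner product of two real amplitude lists."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    for x in range(3):
        amps = ", ".join(f"{a:+.3f} ({m})" for a, m in zip(codeword(x), MODES))
        print(f"Psi_{x}{x}: {amps}")
    print("overlap of two different codewords:", overlap(codeword(0), codeword(1)))
```

Any two distinct codewords have overlap 1/4, the square of the single-letter overlap -1/2, so the three codewords are linearly independent but far from orthogonal.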

An optical circuit for this coding is shown in Fig. 2(a). The angles of the three half waveplates (HWPs), the $\theta$'s, are chosen as $(\theta_0, \theta_1, \theta_2) = (0^\circ, 0^\circ, 0^\circ)$, $(30^\circ, -30^\circ, -15^\circ)$, and $(30^\circ, 30^\circ, 15^\circ)$ for $|\Psi_{00}\rangle$, $|\Psi_{11}\rangle$, and $|\Psi_{22}\rangle$, respectively. The decoding circuit is derived with slight modifications from the general circuit design of Fig. 2(b), which can be applied to any physical qubits. The received codeword is decided to be $|\Psi_{00}\rangle$, $|\Psi_{11}\rangle$, or $|\Psi_{22}\rangle$ according to the detection of a photon by the avalanche photodetector APD0, APD1, or APD2, respectively.

FIG. 2: (a) Encoding and decoding circuits. The angles of the HWPs, $\theta_0$, $\theta_1$, and $\theta_2$, are chosen as described in the text; the two remaining angles labeled in the figure are $-\gamma/2 = -9.74^\circ$ and $-45^\circ$. (b) Quantum circuit to realize the collective decoding by $\{|\Pi_{yy}\rangle\}$, which can be applied to any physical qubits. A received codeword state is first transformed by the five controlled gates, and is then detected by a standard von Neumann measurement on each letter. The open circle indicates conditioning on the control qubit being set to zero, $Q(\varphi) = R_y(\varphi)\sigma_z$, and $\gamma = 19.47^\circ$. Other nomenclature follows Ref. [?]. (c) Circuit for separable (classical) decoding for $C_1$.

In our experiment, CW light from a He-Ne laser at a wavelength of 632.8 nm with 1 mW power is strongly attenuated such that about $10^{-?}$ photons exist on average in the whole circuit. The signal photons are guided to Si APDs, whose quantum efficiency and dark count are typically 70% and 100 [count/sec], respectively, through a multimode optical fiber with a coupling efficiency of about 80%. In this setup, the mutual information is evaluated by constructing the 3-by-3 channel matrix $[P(yy|xx) = |\langle\Pi_{yy}|\Psi_{xx}\rangle|^2]$ from statistical data of single-photon events detected by any of the three APDs, conditioned on an input codeword $|\Psi_{xx}\rangle$. The mutual information thus obtained measures the ratio of the number of bits retrieved to the total number of photon counts. This allows us to simulate communications of "pure" codeword states of two letters by sending and detecting the photons one by one through the channel. The error performance is then determined only by the non-commutativity of the signal states, imperfect alignment of the whole interferometer, and the dark count of the APDs.
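The evaluation described here, from detection counts to mutual information, can be sketched as follows. The counts in the example are synthetic, generated from the ideal probabilities implied by $c$ and $s$ rather than taken from the experiment, and equal priors of 1/3 are assumed; the function names are illustrative.

```python
import math


def channel_from_counts(counts):
    """Turn counts[x][y] (codeword x sent, detector y clicked) into P(y|x)."""
    return [[n / sum(row) for n in row] for row in counts]


def mutual_information(channel, priors):
    """I(X:Y) in bits for channel[x][y] = P(y|x) and priors[x] = P(x)."""
    n_out = len(channel[0])
    p_out = [sum(px * row[y] for px, row in zip(priors, channel)) for y in range(n_out)]
    info = 0.0
    for px, row in zip(priors, channel):
        for y, pyx in enumerate(row):
            if px > 0 and pyx > 0:
                info += px * pyx * math.log2(pyx / p_out[y])
    return info


# Synthetic counts (NOT measured data): 100000 events per codeword,
# split according to the ideal probabilities c^2 and s^2/2.
c2 = ((math.sqrt(2) + 1) / math.sqrt(6)) ** 2
s2_half = ((math.sqrt(2) - 1) / math.sqrt(6)) ** 2 / 2
events = 100_000
counts = [
    [round(events * (c2 if x == y else s2_half)) for y in range(3)]
    for x in range(3)
]

if __name__ == "__main__":
    channel = channel_from_counts(counts)
    info = mutual_information(channel, [1 / 3] * 3)
    print(f"I(X^2:Y^2) = {info:.4f} bit, {info / 2:.4f} bit per letter")
```

Normalizing each row by its own total is what makes the result a quantity per detected photon, so losses that affect all detectors equally drop out of the estimate.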



Each polarization Mach-Zehnder interferometer must be adjusted simultaneously to a proper operating point. This is done by using a bright reference beam and piezo transducers with low-noise voltage sources. The visibility of the whole interferometer is typically 98%. Once the circuit is adjusted, the reference beam is shut off. The signal light is then guided into the encoder and decoder. Photon counts are measured for a five-second duration. This procedure is repeated for each codeword, composing a full sequence of measuring the channel matrix. The temporal stability in this sampling mode corresponds to a relative path length change within 3 nm for at least 200 sec, which causes an error in the mutual information within ±0.005 [bit].

An example of the measured channel matrix is shown as a histogram in Fig. 3. Ideally, the diagonal and off-diagonal elements must be $c^2 = 0.9714$ and $s^2/2 = 0.0143$, respectively. The total events counted for 1 sec is of order $10^6$, while the average count for the off-diagonal elements is about $1.9 \times 10^4$. The background photons amount to about 300 [count/sec]. Including dark counts, the total background count is 2% of the average count for the off-diagonal elements. The mutual information is evaluated as $I(X^2:Y^2) = 1.312 \pm 0.005$ [bit]. For experimental clarity, we measured the variation of the mutual information when the codeword state set $\{|\Psi_{xx}\rangle\}$ is rotated with respect to the decoder state set $\{|\Pi_{yy}\rangle\}$ around the vertical axis in Fig. 1. The result is shown in Fig. 3. The gap between the data points (diamonds) and the ideal curve (solid curve) is mainly attributed to the imperfection of the PBSs. Fluctuations of the data points are mainly due to thermal drifts. The corresponding error bars (~ ±0.005 [bit]) are about the same size as the diamonds. The measured mutual information per letter, 0.656 ± 0.003 [bit/letter], is clearly greater than the classical theoretical limit $C_1 = 0.6454$ [bit/letter], which is the level shown by the dashed lines. The white square represents the experimental $C_1$, 0.644 ± 0.001 [bit/letter]. This is measured by the circuit for classical decoding of Fig. 2(c), which does not entangle the polarization and location qubits. With this separable decoding, the retrieved information can never exceed $2C_1$. Our results clearly show that when an appropriate quantum circuit for entangling the letter states is inserted in front of the separable decoding, one can retrieve more than twice as much information, i.e. more than $C_1$ per letter.

The super-additive coding gain in small blocks is not only valuable as a proof-of-principle demonstration but also of practical importance in quantum-limited communications. Even a two-qubit quantum circuit like that of Fig. 2(b) is useful in boosting the performance of a classical decoder. It can be shown that the decoding error can be greatly reduced by inserting the quantum circuit in front of the classical decoder. The quantum circuit processes a received codeword state quantum collectively prior to converting it into a classical signal. We call such a scheme quantum-classical hybrid coding (QCHC). The theoretical error exponents [2] of QCHC and all-classical coding (ACC) in the ternary letter-state case are listed in Table I for low and high transmission rates $R$. The improvement is more drastic in the higher rate limit. For the rate $R = 0.62$ [bit/letter] (96% of $C_1$), it is possible to reduce the decoding error as $P_e = 2^{-nE(R)} = 2^{-0.0488n}$ by an appropriate classical coding with the composite letters {00, 11, 22} assisted by the pair-wise quantum decoding. To achieve a standard error-free criterion $P_e = 10^{-9}$, QCHC requires the code length $n = 614$ (307 composite letter pairs), whereas ACC typically needs $n = 57300$. As codes get longer, the complexity of the decoder, such as the total number of arithmetic operations, increases and eventually limits the effective transmission speed. For some asymptotically good codes, the total number of arithmetic operations is evaluated [14] to be typically of order $(n \log n)^{?}$. The reduction of code length attained by QCHC will then be practically significant in the trade-off between performance and decoding complexity. This suggests a useful application of small-scale quantum computation.

TABLE I: Error exponent E(R) of QCHC and ACC.

The super-additive quantum coding gain should eventually be applied to more practical resources such as optical pulses of coherent states, especially heavily attenuated coherent states $\{|\alpha_k\rangle\}$ of phase-shift and/or amplitude-shift keying. Unlike our channel model of single-photon polarization and location modes, one must be able to entangle weak coherent pulses with respect to the degrees of freedom of phase and/or amplitude. This is another big challenge.

The authors acknowledge J. A. Vaccaro for his valuable comments. This work was supported by the CREST of JST.

FIG. 3: Upper: Histogram of photon counts for the channel matrix elements $P(yy|xx)$ corresponding to the maximum mutual information. Lower: Measured (diamonds) and theoretical (solid curve) mutual information as a function of the offset angle of the codeword state set $\{|\Psi_{xx}\rangle\}$ from the decoder state set $\{|\Pi_{yy}\rangle\}$ around the vertical axis in Fig. 1. The dotted curve is just a guide for the eyes. The theoretical $C_1$ and accessible information $I_{acc}$ are shown by the dashed and one-dotted lines, respectively [?]. The square represents the experimental $C_1$.
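The code length quoted for QCHC in the discussion of hybrid coding can be checked from the error-exponent bound $P_e = 2^{-nE(R)}$. The Python sketch below assumes an error criterion of $10^{-9}$ and that $n$ must be even because letters are decoded in pairs; both assumptions, and the function name, are made here for illustration. With $E(R) = 0.0488$ they reproduce $n = 614$.

```python
import math


def min_code_length(target_error, exponent, block=1):
    """Smallest n, a multiple of `block`, with 2**(-n * exponent) <= target_error."""
    n = math.ceil(-math.log2(target_error) / exponent)
    return block * math.ceil(n / block)


E_QCHC = 0.0488   # error exponent at R = 0.62 bit/letter, from the text
TARGET = 1e-9     # assumed error criterion
N_ACC = 57300     # code length quoted in the text for all-classical coding

n_qchc = min_code_length(TARGET, E_QCHC, block=2)

if __name__ == "__main__":
    print("QCHC code length:", n_qchc, "letters,", n_qchc // 2, "pairs")
    print("ACC / QCHC length ratio:", round(N_ACC / n_qchc, 1))
```

The unrounded requirement is a little under 613 letters, so the pairing constraint is what raises the result to 614, that is, 307 composite letter pairs.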